Introduction


Background

High-density lipoprotein (HDL) cholesterol is known as the ‘good’ cholesterol because it helps remove other forms of cholesterol from the bloodstream. Higher levels of HDL cholesterol are associated with a lower risk of heart disease.

Air pollution is known to be one of the leading causes of cardiovascular disease. Emerging evidence suggests that particulate-mediated HDL dysfunction might be a novel mechanism linking air pollution exposure to adverse cardiovascular effects. However, few studies have evaluated the impact of exposure to traffic-related air pollution (black carbon, nitrogen dioxide) on HDL. Similarly, temperature has been linked to cardiovascular disease, but little is known about the underlying mechanisms.

Research objectives and questions

Given this background, the project investigated the association of acute exposure to ambient black carbon (one traffic-related air pollutant) and ambient temperature with HDL level. To account for potential latency in HDL changes, the project also assessed the association between 1-day and 2-day lagged exposures and HDL level. Finally, the project explored effect modification by potential risk factors.

Through visualization and modeling, the project aimed to address the following questions.

  1. Is daily ambient black carbon associated with HDL level? Is there a lag effect for this association?

  2. Is daily ambient temperature associated with HDL level? Is there a lag effect for this association?

  3. What are the potential ‘risk factors’ for HDL changes in the study population?

  4. Do these risk factors modify the association between daily ambient black carbon/temperature and HDL level?


Method


Dataset and study population

Because of data disclosure restrictions, the project used a subset of the Veterans Administration Normative Aging Study (NAS) that was made available for class use. NAS is a longitudinal study established in 1963. It enrolled 2,280 men from the Greater Boston area who were aged 21 to 80 and were determined to be free of known chronic medical conditions at an initial health screening. Participants visited the study center approximately every four years for physical examinations, blood pressure measurements, blood sample collection, and questionnaires. Blood samples were used for lipid analysis.

During the follow-up period, black carbon (BC) was measured at a central monitoring site on the roof of Countway Library, Harvard Medical School, in Boston, MA; temperature was also recorded. Single-day lags of these pollutant and meteorological variables were computed for the day of each health visit and for up to 2 days before the visit.

The dataset included a subset of 981 subjects with a total of 2483 observations. Because the NAS dataset was well curated, it was clean and had little missing data. Of the 31 variables, only RACE (29 missing observations) and NEDUC (1 missing observation) had missing values. The summary statistics of each variable showed no implausible values for continuous variables and no miscoded values for categorical variables. Because the amount of missing data was trivial, we simply excluded the observations with missing values. The final dataset had 968 subjects with a total of 2453 observations from 1995 to 2011.
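The complete-case exclusion described above can be sketched as follows. This is a minimal illustration in plain Python rather than the R packages the project used; the toy records and the ID and HDL column names are hypothetical, and only RACE and NEDUC are variable names taken from the report.

```python
# Hypothetical toy records: one row per visit, None marks a missing value.
# Only the RACE and NEDUC column names come from the report.
records = [
    {"ID": 1, "RACE": "White", "NEDUC": 12, "HDL": 45.0},
    {"ID": 1, "RACE": "White", "NEDUC": 12, "HDL": 38.0},
    {"ID": 2, "RACE": None, "NEDUC": 16, "HDL": 52.0},
    {"ID": 3, "RACE": "Black", "NEDUC": None, "HDL": 41.0},
]


def complete_cases(rows, columns):
    """Keep only the rows that have no missing (None) value in `columns`."""
    return [row for row in rows if all(row[col] is not None for col in columns)]


clean = complete_cases(records, ["RACE", "NEDUC"])

# Count observations and distinct subjects that remain after the exclusion.
n_subjects = len({row["ID"] for row in clean})
print(len(clean), "observations from", n_subjects, "subjects")
```

Dropping rows this way is reasonable only when, as here, very few observations are affected.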

Data wrangling and transformation

  • We first created a new categorical variable, hdlcat, that divides HDL level into 2 categories based on the clinical recommendation. For easier plotting and modeling, we also ordered and labeled the levels of the categorical variables. For the summary statistics table, we created labels for key variables.

  • In one of the results sections, we show the univariate association of interest for black carbon/temperature on the day of each visit (lag0), on the day before each visit (lag1), and two days before each visit (lag2). However, the lagged exposure measures were stored in ‘wide’ format, so we transformed the original dataset (one row per visit) into a ‘long’ dataset (one row per visit and lag, so each subject has multiple rows) for easier plotting.

  • Ambient black carbon and temperature exposures were identical for subjects who visited the hospital on the same day, so the dataset contained duplicate values of the two exposures. To plot the time series of temperature and black carbon during follow-up, we created a dataset without duplicates for the variables BC24H and TEMPC24H.

  • Finally, intermediate datasets that were no longer needed were removed to keep the environment tidy.
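The three wrangling steps above can be sketched in plain Python (the project itself used dplyr and tidyr in R). The 40 mg/dL cutoff and the names hdlcat, BC24H, and TEMPC24H come from the report; the ID, DATE, and HDL columns, the lag column names BCD1 and BCD2, and all values are hypothetical.

```python
# Toy visit-level data in 'wide' format: one row per visit, one column per lag.
# BC24H and TEMPC24H are named in the report; BCD1/BCD2 are hypothetical names
# for the 1-day and 2-day lagged black carbon columns.
visits = [
    {"ID": 1, "DATE": "2000-01-03", "HDL": 38.0,
     "BC24H": 1.2, "BCD1": 0.9, "BCD2": 0.7, "TEMPC24H": 2.0},
    {"ID": 2, "DATE": "2000-01-03", "HDL": 51.0,
     "BC24H": 1.2, "BCD1": 0.9, "BCD2": 0.7, "TEMPC24H": 2.0},
    {"ID": 1, "DATE": "2004-06-10", "HDL": 42.0,
     "BC24H": 0.6, "BCD1": 0.8, "BCD2": 1.1, "TEMPC24H": 21.5},
]


def hdl_category(hdl, cutoff=40.0):
    """Two-level HDL category using the 40 mg/dL clinical cutoff."""
    return "At risk (<40)" if hdl < cutoff else "Normal (>=40)"


# Step 1: add the categorical variable hdlcat to every visit.
for visit in visits:
    visit["hdlcat"] = hdl_category(visit["HDL"])

# Step 2: wide -> long, one output row per visit and lag.
LAG_COLUMNS = {"lag0": "BC24H", "lag1": "BCD1", "lag2": "BCD2"}


def to_long(rows, lag_columns):
    long_rows = []
    for row in rows:
        for lag, column in lag_columns.items():
            long_rows.append({"ID": row["ID"], "DATE": row["DATE"],
                              "HDL": row["HDL"], "lag": lag, "BC": row[column]})
    return long_rows


# Step 3: one exposure record per calendar day for the time-series plots.
def daily_exposure(rows):
    by_day = {}
    for row in rows:
        by_day.setdefault(row["DATE"], (row["BC24H"], row["TEMPC24H"]))
    return [{"DATE": day, "BC24H": bc, "TEMPC24H": temp}
            for day, (bc, temp) in sorted(by_day.items())]


long_rows = to_long(visits, LAG_COLUMNS)
daily = daily_exposure(visits)
print(len(long_rows), "long rows;", len(daily), "distinct exposure days")
```

In the long layout, the lag column can be used directly as a faceting or grouping variable when plotting.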

Tools for data exploration

  • The packages used for EDA: skimr.

  • The packages used for data wrangling and transformation: dplyr, tidyr.

  • The packages used for visualization and tables: ggplot2, plotly, ggpubr, table1, kableExtra.


Results


Descriptive statistics

According to the clinical cutoff, men with HDL below 40 mg/dL are at risk of cardiovascular disease. The table below shows basic summary statistics of key variables by the two levels of HDL.

Compared with participants with HDL below the cutoff, participants with normal HDL had lower BMI and were exposed to lower same-day black carbon and lower same-day temperature. The proportion of obese participants was higher in the at-risk group, and the proportion with diabetes was lower in the normal-HDL group. HDL levels were more likely to be above the cutoff during the cold season than during the warm season.

Summary statistics of (repeated) measurements of characteristics by HDL level

                      At risk (<40)         Normal (>=40)         Overall
                      (N=729)               (N=1724)              (N=2453)
AGE (years)
  Mean (SD)           71.7 (7.64)           73.3 (7.31)           72.9 (7.44)
  Median [Min, Max]   72.0 [51.0, 96.0]     73.0 [49.0, 97.0]     73.0 [49.0, 97.0]
BMI (kg/m²)
  Mean (SD)           29.3 (4.49)           27.3 (3.62)           27.9 (4.00)
  Median [Min, Max]   28.5 [19.4, 52.6]     27.0 [16.7, 43.9]     27.4 [16.7, 52.6]
BMI CATEGORY
  Underweight         0 (0%)                6 (0.3%)              6 (0.2%)
  Normal              456 (62.6%)           1369 (79.4%)          1825 (74.4%)
  Overweight          0 (0%)                0 (0%)                0 (0%)
  Obese               273 (37.4%)           349 (20.2%)           622 (25.4%)
RACE
  White               722 (99.0%)           1677 (97.3%)          2399 (97.8%)
  Black               4 (0.5%)              33 (1.9%)             37 (1.5%)
  Hispanic White      3 (0.4%)              10 (0.6%)             13 (0.5%)
  Hispanic Black      0 (0%)                4 (0.2%)              4 (0.2%)
  American Indian     0 (0%)                0 (0%)                0 (0%)
STATIN USE
  No                  448 (61.5%)           1041 (60.4%)          1489 (60.7%)
  Yes                 281 (38.5%)           683 (39.6%)           964 (39.3%)
DIABETE
  No                  578 (79.3%)           1540 (89.3%)          2118 (86.3%)
  Yes                 151 (20.7%)           184 (10.7%)           335 (13.7%)
SEASON
  Cold                287 (39.4%)           775 (45.0%)           1062 (43.3%)
  Warm                442 (60.6%)           949 (55.0%)           1391 (56.7%)
SMOKING STATUS
  Never               209 (28.7%)           519 (30.1%)           728 (29.7%)
  Current             26 (3.6%)             71 (4.1%)             97 (4.0%)
  Former              494 (67.8%)           1134 (65.8%)          1628 (66.4%)
SAME DAY BLACK CARBON (µg/m³)
  Mean (SD)           1.08 (0.636)          0.956 (0.552)         0.993 (0.581)
  Median [Min, Max]   0.953 [0.119, 4.05]   0.824 [0.180, 3.56]   0.854 [0.119, 4.05]
SAME DAY TEMPERATURE (°C)
  Mean (SD)           13.3 (9.14)           12.6 (8.54)           12.8 (8.73)
  Median [Min, Max]   14.0 [-11.4, 31.0]    12.8 [-13.9, 31.0]    13.3 [-13.9, 31.0]

Distribution of HDL by different levels of risk factors

To further explore whether demographic and physiological variables were associated with HDL levels, we used violin plots (categorical variables) and scatterplots (continuous variables) to show the distribution of HDL across levels of potential risk factors and to identify risk factors for HDL in the NAS population.

  • For categorical variables, the violins had different shapes across the levels of STATIN, DIABETE, and RACE.

  • For continuous variables, AGE was slightly positively associated with HDL (r = 0.092, p-value < 0.05), and BMI was moderately negatively associated with HDL (r = -0.27, p-value < 0.05).
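The r values above are correlation coefficients; the report does not state which kind, so the sketch below assumes Pearson's r. It is written in plain Python with made-up BMI and HDL values, purely to show how such a coefficient is computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: HDL tends to fall as BMI rises.
bmi = [22.0, 25.0, 27.5, 30.0, 34.0]
hdl = [60.0, 52.0, 50.0, 41.0, 36.0]

r = pearson_r(bmi, hdl)
print("r =", round(r, 3))
```

Because each participant contributes several visits, the observations are not independent, so a p-value attached to such a coefficient should be read as descriptive only.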

[Figure panels: Statin, Diabetes, Race, Season, Smoking status, Hypertension, Days of the week, Age, BMI]

Univariate relationship between daily (lag) ambient black carbon, temperature and HDL

Before plotting, we checked the distribution of HDL and found it to be right-skewed. A log transformation of HDL made the distribution approximately normal. Therefore, we used log(HDL) as the outcome measure in the following analyses.
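One way to see the effect of the transformation is to compare the sample skewness before and after taking logs. The sketch below is illustrative only: the HDL values are invented, and the report does not say how skewness was assessed (it may have been judged visually from histograms).

```python
import math


def skewness(values):
    """Moment-based sample skewness (positive means a longer right tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed HDL values in mg/dL.
hdl = [32.0, 36.0, 40.0, 44.0, 48.0, 55.0, 70.0, 95.0]
log_hdl = [math.log(v) for v in hdl]

print("skewness of HDL:     ", round(skewness(hdl), 2))
print("skewness of log(HDL):", round(skewness(log_hdl), 2))
```

The log transformation compresses the long right tail, which moves the skewness closer to zero.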

[Figure panels: HDL, Log(HDL)]

Effect and lag effect of ambient black carbon and modification by risk factors

  • The patterns of association were similar at lag0 and lag1 but differed at lag2.

  • Overall, same-day black carbon (lag0), previous-day black carbon (lag1), and black carbon two days earlier (lag2) all appeared to be inversely and linearly associated with HDL level. The magnitude of the association was similar across the lags.

  • Effect modification by diabetes was observed at lag2, where a null association was found among participants with diabetes.

  • Among participants above the median age, the association was stronger than among those below the median age at lag0 and lag2.

  • No clear pattern was seen for modification by statin use or BMI.
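The median split used to look at effect modification by age can be sketched as below. This is an unadjusted illustration in plain Python with invented data: it fits a separate least-squares slope of log(HDL) on same-day black carbon in each age stratum and ignores the repeated measurements. How ties at the median were assigned in the project is not stated; the sketch puts them in the younger stratum.

```python
import math
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical observations; AGE and HDL column names are assumptions.
rows = [
    {"AGE": 65, "BC24H": 0.5, "HDL": 50.0},
    {"AGE": 68, "BC24H": 1.0, "HDL": 49.0},
    {"AGE": 70, "BC24H": 1.5, "HDL": 48.0},
    {"AGE": 75, "BC24H": 0.5, "HDL": 50.0},
    {"AGE": 78, "BC24H": 1.0, "HDL": 45.0},
    {"AGE": 82, "BC24H": 1.5, "HDL": 40.0},
]

median_age = statistics.median([row["AGE"] for row in rows])

# Split observations into two strata around the median age.
strata = {"younger": [], "older": []}
for row in rows:
    key = "older" if row["AGE"] > median_age else "younger"
    strata[key].append(row)

# Slope of log(HDL) per unit of black carbon within each stratum.
slopes = {
    name: ols_slope([r["BC24H"] for r in group],
                    [math.log(r["HDL"]) for r in group])
    for name, group in strata.items()
}
print(slopes)
```

A clearly steeper slope in one stratum than in the other is the kind of pattern that suggests effect modification, although a formal check would use an interaction term in a model that accounts for repeated measures.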

[Figure panels: Overall, Diabetes, Statin Use, BMI, Age]

Effect and lag effect of temperature and modification by risk factors

  • The patterns of association were similar across the three lags.

  • Overall, the relationship had an inverted U shape, with average HDL increasing with temperature below about 10 °C and decreasing between 10 and 30 °C.

  • Effect modification by diabetes was observed, with a null association among participants with diabetes.

  • Among statin users, there was an inverse association between HDL and temperature. Among participants not using statins, the relationship had an inverted U shape, with average HDL increasing below about 10 °C and decreasing between 10 and 30 °C.

  • Among participants above the median age, there was an inverse association between HDL and temperature. Among those below the median age, the relationship had an inverted U shape, with average HDL increasing below about 10 °C and decreasing between 10 and 30 °C.

  • No clear pattern was seen for BMI.
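An inverted U shape can be checked informally by averaging HDL within temperature bins. The sketch below uses 10 °C bins and invented data; it is not the smoothing method behind the report's figures, which is not described.

```python
import math
from collections import defaultdict

# Hypothetical (temperature in °C, HDL in mg/dL) pairs.
observations = [(-5.0, 44.0), (5.0, 48.0), (8.0, 50.0), (12.0, 50.0),
                (18.0, 47.0), (25.0, 43.0), (29.0, 41.0)]


def bin_means(pairs, width=10.0):
    """Mean HDL within temperature bins of the given width.

    Each bin is keyed by its lower edge, e.g. 10.0 covers [10, 20).
    """
    bins = defaultdict(list)
    for temp, hdl in pairs:
        lower_edge = math.floor(temp / width) * width
        bins[lower_edge].append(hdl)
    return {edge: sum(vals) / len(vals) for edge, vals in sorted(bins.items())}


means = bin_means(observations)
for edge, mean_hdl in means.items():
    print(f"[{edge:.0f}, {edge + 10:.0f}) °C: mean HDL = {mean_hdl:.1f}")
```

Bin means that rise toward the middle of the temperature range and fall afterwards are consistent with an inverted U.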

[Figure panels: Overall, Diabetes, Statin Use, BMI, Age]

Conclusion

The relationship between same-day and lagged temperature and HDL levels had an inverted U shape, with average HDL increasing below about 10 °C and decreasing between 10 and 30 °C. In contrast, we found an inverse association between same-day and lagged ambient black carbon exposure and HDL levels. This was further supported by the results of a linear mixed-effects model, which detected a significant effect of same-day black carbon and its two lags on HDL (see Appendix). Furthermore, diabetes and age appeared to modify the association between daily temperature and HDL levels across all the lags assessed. Participants with diabetes showed a null association between daily temperature and HDL levels, whereas participants without diabetes showed an inverted U-shaped pattern similar to the overall association. Among participants above the median age, HDL levels decreased as temperature increased, whereas among those below the median age a similar inverted U-shaped pattern was observed.

Copyright © 2022, Yuhong Hu.